PAC Reinforcement Learning Algorithm for General-Sum Markov Games
نویسندگان
چکیده
This article presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms Markov games. Using the idea of delayed Q-learning, this extends well-known Nash Q-learning algorithm to build new PAC MARL general-sum In addition guiding design provably algorithm, enables checking whether an arbitrary is PAC. Comparative numerical results demonstrate algorithm's performance and robustness.
منابع مشابه
Multiagent reinforcement learning: algorithm converging to Nash equilibrium in general-sum discounted stochastic games
Reinforcement learning turned out a technique that allowed robots to ride a bicycle, computers to play backgammon on the level of human world masters and solve such complicated tasks of high dimensionality as elevator dispatching. Can it come to rescue in the next generation of challenging problems like playing football or bidding on virtual markets? Reinforcement learning that provides a way o...
متن کاملTaking turns in general sum Markov games
This paper provides a novel approach to multi-agent coordination in general sum Markov games. Contrary to what is common in multi-agent learning, our approach does not focus on reaching a particular equilibrium between agent policies. Instead, it learns a basis set of special joint agent policies, over which it can randomize to build different solutions. The main idea is to tackle a Markov game...
متن کاملQL2, a simple reinforcement learning scheme for two-player zero-sum Markov games
Markov games are a framework which formalises n-agent reinforcement learning. For instance, Littman proposed the minimax-Q algorithm to model two-agent zero-sum problems. This paper proposes a new simple algorithm in this framework, QL2, and compares it to several standard algorithms (Q-learning, Minimax and minimax-Q). Experiments show that QL2 converges to optimal mixed policies, as minimax-Q...
متن کاملHierarchical Multiagent Reinforcement Learning in Markov Games
Interactions between intelligent agents in multiagent systems can be modeled and analyzed by using game theory. The agents select actions that maximize their utility function so that they also take into account the behavior of the other agents in the system. Each agent should therefore utilize some model of the other agents. In this paper, the focus is on the situation which has a temporal stru...
متن کاملValue-function reinforcement learning in Markov games
Markov games are a model of multiagent environments that are convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason ab...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: IEEE Transactions on Automatic Control
سال: 2023
ISSN: ['0018-9286', '1558-2523', '2334-3303']
DOI: https://doi.org/10.1109/tac.2022.3219340